Towards a fast parallel sparse matrix-vector multiplication

Authors

  • Roman Geus
  • Stefan Röllin
Abstract

The sparse matrix-vector product is an important computational kernel that runs inefficiently on many computers with super-scalar RISC processors. In this paper we analyse the performance of the sparse matrix-vector product with symmetric matrices originating from the FEM and describe techniques that lead to a fast implementation. It is shown how these optimisations can be incorporated into an efficient parallel implementation using message passing. We conduct numerical experiments on many different machines and show that our optimisations speed up the sparse matrix-vector multiplication substantially.
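
As an illustration of the kernel the abstract refers to, the following is a minimal sketch of a sparse matrix-vector product for a symmetric matrix. It is not the authors' implementation; it assumes that only the lower triangle (including the diagonal) is stored in compressed sparse row (CSR) form, and the function name `sym_csr_matvec` and the array names are hypothetical. Each stored off-diagonal entry is used twice, once for its own row and once for the mirrored row, which roughly halves the matrix data read from memory.

```python
def sym_csr_matvec(n, row_ptr, col_idx, values, x):
    """Return y = A @ x for a symmetric n x n matrix A.

    Assumption: only the lower triangle of A (diagonal included) is
    stored in CSR arrays row_ptr, col_idx, values.
    """
    y = [0.0] * n
    for i in range(n):
        xi = x[i]
        acc = 0.0  # accumulates the contribution of row i's stored entries
        for k in range(row_ptr[i], row_ptr[i + 1]):
            j = col_idx[k]
            a = values[k]
            acc += a * x[j]       # entry A[i][j] contributes to y[i]
            if j != i:
                y[j] += a * xi    # mirrored entry A[j][i] contributes to y[j]
        y[i] += acc
    return y


# Small example: A = [[4, 1, 0], [1, 3, 2], [0, 2, 5]], lower triangle only.
row_ptr = [0, 1, 3, 5]
col_idx = [0, 0, 1, 1, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
x = [1.0, 2.0, 3.0]
y = sym_csr_matvec(3, row_ptr, col_idx, values, x)
print(y)
```

In a message-passing setting, the scattered updates to `y[j]` may touch rows owned by another process, so a parallel version would need an additional exchange step that this serial sketch does not show.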

Similar resources

Two-dimensional cache-oblivious sparse matrix-vector multiplication

In earlier work, we presented a one-dimensional cache-oblivious sparse matrix–vector (SpMV) multiplication scheme which has its roots in one-dimensional sparse matrix partitioning. Partitioning is often used in distributed-memory parallel computing for the SpMV multiplication, an important kernel in many applications. A logical extension is to move towards using a two-dimensional partitioning. ...

Optimizing Parallel Sparse Matrix-Vector Multiplication by Partitioning

Sparse matrix times vector multiplication is an important kernel in scientific computing. We study how to optimize the performance of this operation in parallel by reducing communication. We review existing approaches and present a new partitioning method for symmetric matrices. Our method is simple and can be implemented using existing software for hypergraph partitioning. Experimental results...

A General Graph Model for Representing Exact Communication Volume in Parallel Sparse Matrix-Vector Multiplication

In this paper, we present a new graph model of sparse matrix decomposition for parallel sparse matrix–vector multiplication. Our model differs from previous graph-based approaches in two main respects. Firstly, our model is based on edge colouring rather than vertex partitioning. Secondly, our model is able to correctly quantify and minimise the total communication volume of the parallel sparse...

Efficient Multicore Sparse Matrix-Vector Multiplication for Finite Element Electromagnetics on the Cell-BE processor

Multicore systems are rapidly becoming a dominant industry trend for accelerating electromagnetics computations, driving researchers to address parallel programming paradigms early in application development. We present a new sparse representation and a two-level partitioning scheme for efficient sparse matrix-vector multiplication on multicore systems, and show results for a set of finite elem...

Data-parallel programming with Intel Array Building Blocks (ArBB)

Intel Array Building Blocks is a high-level data-parallel programming environment designed to produce scalable and portable results on existing and upcoming multi- and many-core platforms. We have chosen several mathematical kernels (a dense matrix-matrix multiplication, a sparse matrix-vector multiplication, a 1-D complex FFT and a conjugate gradients solver) as synthetic benchmarks and representa...

Publication date: 1999